Intelligent System for Entities Extractions (ISEE) from Natural Language Texts

Authors

  • Igor P. Kuznetsov
  • Elena B. Kozerenko
  • Konstantin I. Kuznetsov
Abstract

This paper describes a semantic linguistic processor that extracts entities and the links between them from natural language texts. The conceptual model underlying the algorithmic developments is the extended semantic network (ESN). The paper analyzes the use of the processor for text formalization, in Russian and English, in several subject areas: economic crisis monitoring, criminology (summaries of incidents, accusatory conclusions, etc.), mass media documents about terrorist activities, and personnel management (autobiographies, résumés). The peculiarities of the texts are captured in the processor's linguistic knowledge, so the system can be tuned to different subject areas. For each problem area the paper examines its special features: the collections of extracted entities, the means for identifying them, their connections, the contractions that occur, punctuation and special signs, the specific character of the language constructions, etc. All of these features were taken into account in developing the linguistic knowledge.
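
The idea of a processor that is tuned to a subject area through its linguistic knowledge can be illustrated with a small sketch. This is not the ISEE implementation and does not reproduce the ESN formalism; it is a toy example that assumes regular-expression rules as the "linguistic knowledge" and a plain node-and-link structure as the network. The names `Network`, `PERSONNEL_RULES`, and `extract`, the patterns, and the sample sentence are all hypothetical.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Network:
    """Toy stand-in for a semantic network: typed nodes plus labeled links."""
    nodes: dict = field(default_factory=dict)  # entity text -> entity type
    links: list = field(default_factory=list)  # (relation, source, target)


# Hypothetical "linguistic knowledge" for one subject area (personnel texts).
# Swapping this dictionary is how the sketch is "tuned" to another area.
PERSONNEL_RULES = {
    "entities": {
        "PERSON": re.compile(r"\b[A-Z][a-z]+ [A-Z]\. [A-Z][a-z]+\b"),
        "ORGANIZATION": re.compile(
            r"\b[A-Z][A-Za-z]+ (?:Institute|University|Bank)\b"
        ),
    },
    # relation name -> (source type, cue between the entities, target type)
    "relations": {
        "WORKS_AT": ("PERSON", re.compile(r"\bworks at\b"), "ORGANIZATION"),
    },
}


def extract(text, rules):
    """Find entities with the rule patterns, then link adjacent entities
    when a relation cue appears in the text between them."""
    net = Network()
    spans = []
    for etype, pattern in rules["entities"].items():
        for m in pattern.finditer(text):
            net.nodes[m.group()] = etype
            spans.append((m.start(), m.end(), etype, m.group()))
    spans.sort()  # process entities in order of appearance
    for (_, e1, t1, n1), (s2, _, t2, n2) in zip(spans, spans[1:]):
        between = text[e1:s2]
        for rel, (src_type, cue, dst_type) in rules["relations"].items():
            if t1 == src_type and t2 == dst_type and cue.search(between):
                net.links.append((rel, n1, n2))
    return net


# Fictional sample sentence, used only for illustration.
net = extract("Anna B. Petrova works at Northern Bank.", PERSONNEL_RULES)
print(net.nodes)
print(net.links)
```

Keeping the rules in a separate dictionary mirrors, in a very reduced form, the separation the abstract describes between the processor and its subject-area knowledge.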

Similar resources

Presenting a method for extracting structured domain-dependent information from Farsi Web pages

Extracting structured information about entities from web texts is an important task in web mining, natural language processing, and information extraction. Information extraction is useful in many applications including search engines, question-answering systems, recommender systems, machine translation, etc. An information extraction system aims to identify the entities from the text and extr...

A Neural Network Based System for Named Entity Recognition and Classification in Persian Texts

Named Entity Recognition (NER) is a fundamental task in natural language processing and also known as a subset of information extraction. We seek to locate and classify named entities in text into predefined categories such as the names of persons, organizations, locations, expressions of times, etc. Named Entity Recognition for English texts has been researched widely for the past years, howev...

Detection and Recognition of Multi-language Traffic Sign Context by Intelligent Driver Assistance Systems

This paper concerns the design of a new intelligent driver assistance system based on traffic sign detection with Persian context. The primary aim of this system is to increase the precision of drivers in choosing their path with regard to traffic signs. To achieve this goal, a new framework that implements fuzzy logic was used to detect traffic signs in videos captured along a highway f...

Sharif Text Editor: A System for Editing and Spell Checking Persian Texts

In this paper, we will introduce an intelligent system to edit and spell check Persian texts. The goal is editing and preprocessing Persian texts for natural language processing tasks. This system is based on an expandable and engineering approach and is composed of three subsystems: Persian text editor, spell checker and stemmer. These parts interact with each other to edit texts. To do this, ...

Natural Language Processing of Mathematical Texts in mArachna

mArachna is a technical framework designed for the extraction of mathematical knowledge from natural language texts. mArachna avoids the problems typically encountered in automated-reasoning based approaches through the use of natural language processing techniques taking advantage of the strict formalized language characterizing mathematical texts. Mathematical texts possess a strict internal ...

Publication date: 2009